Goto

Collaborating Authors

 Anau



Nested Slice Sampling: Vectorized Nested Sampling for GPU-Accelerated Inference

Yallup, David, Kroupa, Namu, Handley, Will

arXiv.org Machine Learning

Model comparison and calibrated uncertainty quantification often require integrating over parameters, but scalable inference can be challenging for complex, multimodal targets. Nested Sampling is a robust alternative to standard MCMC, yet its typically sequential structure and hard constraints make efficient accelerator implementations difficult. This paper introduces Nested Slice Sampling (NSS), a GPU-friendly, vectorized formulation of Nested Sampling that uses Hit-and-Run Slice Sampling for constrained updates. A tuning analysis yields a simple near-optimal rule for setting the slice width, improving high-dimensional behavior and making per-step compute more predictable for parallel execution. Experiments on challenging synthetic targets, high dimensional Bayesian inference, and Gaussian process hyperparameter marginalization show that NSS maintains accurate evidence estimates and high-quality posterior samples, and is particularly robust on difficult multimodal problems where current state-of-the-art methods such as tempered SMC baselines can struggle. An open-source implementation is released to facilitate adoption and reproducibility.


Blind Strong Gravitational Lensing Inversion: Joint Inference of Source and Lens Mass with Score-Based Models

Barco, Gabriel Missael, Legin, Ronan, Stone, Connor, Hezaveh, Yashar, Perreault-Levasseur, Laurence

arXiv.org Artificial Intelligence

Score-based models can serve as expressive, data-driven priors for scientific inverse problems. In strong gravitational lensing, they enable posterior inference of a background galaxy from its distorted, multiply-imaged observation. Previous work, however, assumes that the lens mass distribution (and thus the forward operator) is known. We relax this assumption by jointly inferring the source and a parametric lens-mass profile, using a sampler based on GibbsDDRM but operating in continuous time. The resulting reconstructions yield residuals consistent with the observational noise, and the marginal posteriors of the lens parameters recover true values without systematic bias. To our knowledge, this is the first successful demonstration of joint source-and-lens inference with a score-based prior.


A Probabilistic U-Net Approach to Downscaling Climate Simulations

Alipourhajiagha, Maryam, Lemaire, Pierre-Louis, Diouane, Youssef, Carreau, Julie

arXiv.org Artificial Intelligence

Climate models are limited by heavy computational costs, often producing outputs at coarse spatial resolutions, while many climate change impact studies require finer scales. Statistical downscaling bridges this gap, and we adapt the probabilistic U-Net for this task, combining a deterministic U-Net backbone with a variational latent space to capture aleatoric uncertainty. We evaluate four training objectives, afCRPS and WMSE-MS-SSIM with three settings for downscaling precipitation and temperature from $16\times$ coarser resolution. Our main finding is that WMSE-MS-SSIM performs well for extremes under certain settings, whereas afCRPS better captures spatial variability across scales.


Dynamic SBI: Round-free Sequential Simulation-Based Inference with Adaptive Datasets

Lyu, Huifang, Alvey, James, Montel, Noemi Anau, Pieroni, Mauro, Weniger, Christoph

arXiv.org Machine Learning

Simulation-based inference (SBI) is emerging as a new statistical paradigm for addressing complex scientific inference problems. By leveraging the representational power of deep neural networks, SBI can extract the most informative simulation features for the parameters of interest. Sequential SBI methods extend this approach by iteratively steering the simulation process towards the most relevant regions of parameter space. This is typically implemented through an algorithmic structure, in which simulation and network training alternate over multiple rounds. This strategy is particularly well suited for high-precision inference in high-dimensional settings, which are commonplace in physics applications with growing data volumes and increasing model fidelity. Here, we introduce dynamic SBI, which implements the core ideas of sequential methods in a round-free, asynchronous, and highly parallelisable manner. At its core is an adaptive dataset that is iteratively transformed during inference to resemble the target observation. Simulation and training proceed in parallel: trained networks are used both to filter out simulations incompatible with the data and to propose new, more promising ones. Compared to round-based sequential methods, this asynchronous structure can significantly reduce simulation costs and training overhead. We demonstrate that dynamic SBI achieves significant improvements in simulation and training efficiency while maintaining inference performance. We further validate our framework on two challenging astrophysical inference tasks: characterising the stochastic gravitational wave background and analysing strong gravitational lensing systems. Overall, this work presents a flexible and efficient new paradigm for sequential SBI.





ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling

Wang, Xiao, Choi, Jong-Youl, Kurihaya, Takuya, Lyngaas, Isaac, Yoon, Hong-Jun, Xiao, Xi, Pugmire, David, Fan, Ming, Nafi, Nasik M., Tsaris, Aristeidis, Aji, Ashwin M., Hossain, Maliha, Wahib, Mohamed, Wang, Dali, Thornton, Peter, Balaprakash, Prasanna, Ashfaq, Moetasim, Lu, Dan

arXiv.org Artificial Intelligence

Sparse observations and coarse-resolution climate models limit effective regional decision-making, underscoring the need for robust downscaling. However, existing AI methods struggle with generalization across variables and geographies and are constrained by the quadratic complexity of Vision Transformer (ViT) self-attention. We introduce ORBIT-2, a scalable foundation model for global, hyper-resolution climate downscaling. ORBIT-2 incorporates two key innovations: (1) Residual Slim ViT (Reslim), a lightweight architecture with residual learning and Bayesian regularization for efficient, robust prediction; and (2) TILES, a tile-wise sequence scaling algorithm that reduces self-attention complexity from quadratic to linear, enabling long-sequence processing and massive parallelism. ORBIT-2 scales to 10 billion parameters across 65,536 GPUs, achieving up to 4.1 exaFLOPS sustained throughput and 74--98% strong scaling efficiency. It supports downscaling to 0.9 km global resolution and processes sequences up to 4.2 billion tokens. On 7 km resolution benchmarks, ORBIT-2 achieves high accuracy with $R^2$ scores in the range of 0.98--0.99 against observational data.


Multidimensional Distributional Neural Network Output Demonstrated in Super-Resolution of Surface Wind Speed

Goldwyn, Harrison J., Krock, Mitchell, Rudi, Johann, Getter, Daniel, Bessac, Julie

arXiv.org Machine Learning

Accurate quantification of uncertainty in neural network predictions remains a central challenge for scientific applications involving high-dimensional, correlated data. While existing methods capture either aleatoric or epistemic uncertainty, few offer closed-form, multidimensional distributions that preserve spatial correlation while remaining computationally tractable. In this work, we present a framework for training neural networks with a multidimensional Gaussian loss, generating closed-form predictive distributions over outputs with non-identically distributed and heteroscedastic structure. Our approach captures aleatoric uncertainty by iteratively estimating the means and covariance matrices, and is demonstrated on a super-resolution example. We leverage a Fourier representation of the covariance matrix to stabilize network training and preserve spatial correlation. We introduce a novel regularization strategy -- referred to as information sharing -- that interpolates between image-specific and global covariance estimates, enabling convergence of the super-resolution downscaling network trained on image-specific distributional loss functions. This framework allows for efficient sampling, explicit correlation modeling, and extensions to more complex distribution families all without disrupting prediction performance. We demonstrate the method on a surface wind speed downscaling task and discuss its broader applicability to uncertainty-aware prediction in scientific models.